A prototype for projecting HPSG syntactic lexica towards LMF

Authors

  • Kais Haddar
  • Héla Fehri
  • Laurent Romary
Abstract

The comparative evaluation of Arabic HPSG grammar lexica requires a deep study of their linguistic coverage. The complexity of this task results mainly from the heterogeneity of the descriptive components within those lexica (underlying linguistic resources and different data categories, for example). It is therefore essential to define more homogeneous representations, which in turn will enable us to compare them and eventually merge them. In this context, we present a method for comparing HPSG lexica based on a rule system. This method is implemented within a prototype for the projection from Arabic HPSG to a normalised pivot language compliant with LMF (ISO 24613 Lexical Markup Framework) and serialised using a TEI (Text Encoding Initiative) based representation. The design of this system is based on an initial study of the HPSG formalism looking at its adequacy for the representation of Arabic, and from this, we identify the appropriate feature structures corresponding to each Arabic lexical category and their possible LMF counterparts.
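The rule-based projection described in the abstract can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' prototype: the feature names (`PHON`, `SYNSEM`, `HEAD`, `SUBCAT`), the target category labels, the rule table, and the sample entry are all hypothetical, and the TEI output uses only a handful of dictionary elements (`entry`, `form`, `orth`, `gramGrp`, `pos`, `gram`).

```python
import xml.etree.ElementTree as ET

# Hypothetical HPSG-style lexical sign, written as a nested dict.
hpsg_entry = {
    "PHON": "kataba",
    "SYNSEM": {
        "HEAD": {"POS": "verb", "ASPECT": "perfective"},
        "SUBCAT": ["NP[nom]", "NP[acc]"],
    },
}

# Hypothetical projection rules: HPSG feature path -> LMF-style category label.
RULES = [
    (("PHON",), "writtenForm"),
    (("SYNSEM", "HEAD", "POS"), "partOfSpeech"),
    (("SYNSEM", "HEAD", "ASPECT"), "aspect"),
    (("SYNSEM", "SUBCAT"), "subcategorizationFrame"),
]


def get_path(fs, path):
    """Follow a feature path through nested dicts; return None if it is absent."""
    for key in path:
        if not isinstance(fs, dict) or key not in fs:
            return None
        fs = fs[key]
    return fs


def project(fs, rules=RULES):
    """Apply each rule whose path exists in the feature structure."""
    pivot = {}
    for path, category in rules:
        value = get_path(fs, path)
        if value is not None:
            pivot[category] = value
    return pivot


def to_tei(pivot):
    """Serialise the pivot representation as a TEI-style <entry> string."""
    entry = ET.Element("entry")
    if "writtenForm" in pivot:
        form = ET.SubElement(entry, "form", type="lemma")
        ET.SubElement(form, "orth").text = pivot["writtenForm"]
    gram_grp = ET.SubElement(entry, "gramGrp")
    for category, value in pivot.items():
        if category == "writtenForm":
            continue
        if category == "partOfSpeech":
            ET.SubElement(gram_grp, "pos").text = value
        else:
            # List values (e.g. a subcategorisation list) are joined with spaces.
            text = " ".join(value) if isinstance(value, list) else value
            ET.SubElement(gram_grp, "gram", type=category).text = text
    return ET.tostring(entry, encoding="unicode")


pivot = project(hpsg_entry)
tei_xml = to_tei(pivot)
```

Because two lexica projected this way share one set of category labels, they could in principle be compared entry by entry on the pivot dictionaries rather than on their original, heterogeneous feature structures.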

Similar resources

Projecting LMF Lexica Towards OWL-DL through LMF-JAPE Patterns to Obtain Interoperable Formats

The development of editors, analyzers, translators and other types of NLP system can involve several representation languages. The heterogeneity of these languages raises interoperability issues at different levels and in different contexts. In language technology, interoperability has proved crucial, since its lack costs the translation industry a fortune where it is paid pr...

A Syntactic Lexicon for Arabic Verbs

In this paper, we present a model of the syntactic lexicon for Arabic verbs based on the Lexical Markup Framework. This ISO standard lets us describe lexical information in a simple way using general guidelines and enables the sharing of resources that follow the standard. We discuss the syntactic information associated with verbs and the model we propose to structure and represent the entries...

Towards the Fully Automatic Merging of Lexical Resources: A Step Forward

This article reports on the results of research towards the fully automatic merging of lexical resources. Our main goal is to show the generality of the proposed approach, which has previously been applied to merge Spanish Subcategorization Frames lexica. In this work we extend and apply the same technique to perform the merging of morphosyntactic lexica encoded in LMF. The experi...

COLDIC, a Lexicographic Platform for LMF compliant lexica

Despite the importance of lexical resources for a number of NLP applications (Machine Translation, Information Extraction, Event Detection and Tracking, Question Answering, among others), there has been a traditional lack of generic tools for the creation, maintenance and management of computational lexica. The most direct obstacle for the development of such generic tools, that is, independ...

Proposals for a normalized representation of Standard Arabic full form lexica

Standardized lexical resources are an important prerequisite for the development of robust and wide coverage natural language processing applications. Therefore, we applied the Lexical Markup Framework, a recent ISO initiative towards standards for designing, implementing and representing lexical resources, to a test bed of data for an Arabic full form lexicon. Besides minor structural accommoda...

Journal title:
  • JLCL

Volume 27

Publication date: 2012